Critical assessment of methods of protein structure prediction: Progress and new directions in round XI
Modeling of protein structure from amino acid sequence now plays a major role in structural biology. Here we report new
developments and progress from the CASP11 community experiment, assessing the state of the art in structure modeling.
Notable points include the following: (1) New methods for predicting three dimensional contacts resulted in a few spectacular
template free models in this CASP, whereas models based on sequence homology to proteins with experimental structure
continue to be the most accurate. (2) Refinement of initial protein models, primarily using molecular dynamics related
approaches, has now advanced to the point where the best methods can consistently (though slightly) improve nearly all
models. (3) The use of relatively sparse NMR constraints dramatically improves the accuracy of models, and another type of
sparse data, chemical crosslinking, introduced in this CASP, also shows promise for producing better models. (4) A new
emphasis on modeling protein complexes, in collaboration with CAPRI, has produced interesting results, but also shows the
need for more focus on this area. (5) Methods for estimating the accuracy of models have advanced to the point where they
are of considerable practical use. (6) A first assessment demonstrates that models can sometimes successfully address biological
questions that motivate experimental structure determination. (7) There is continuing progress in accuracy of modeling
regions of structure not directly available by comparative modeling, while there is marginal or no progress in some other
areas.
New encouraging developments in contact prediction: Assessment of the CASP11 results
This article provides a report on the state-of-the-art in the prediction of intra-molecular residue-residue contacts in proteins
based on the assessment of the predictions submitted to the CASP11 experiment. The assessment emphasis is placed on the
accuracy in predicting long-range contacts. Twenty-nine groups participated in contact prediction in CASP11. At least eight
of them used the recently developed evolutionary coupling techniques, with the top group (CONSIP2) reaching precision of
27% on target proteins that could not be modeled by homology. This result indicates a breakthrough in the development of
methods based on the correlated mutation approach. Successful prediction of contacts was shown to be practically helpful
in modeling three-dimensional structures; in particular target T0806 was modeled exceedingly well with accuracy not yet
seen for ab initio targets of this size (>250 residues).
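The long-range contact precision quoted in this abstract can be illustrated with a minimal sketch. The function name, the toy residue pairs, and the sequence-separation threshold of 24 are assumptions made for illustration; the abstract does not spell out the exact CASP11 definitions or evaluation scripts.

```python
from typing import Iterable, Set, Tuple

Contact = Tuple[int, int]  # a pair of residue indices


def long_range_precision(predicted: Iterable[Contact],
                         native: Iterable[Contact],
                         min_separation: int = 24) -> float:
    """Fraction of predicted long-range pairs that are native contacts.

    min_separation is an assumed threshold for calling a pair "long-range".
    """
    # Normalise pair order and keep only pairs far apart in sequence.
    long_range: Set[Contact] = {
        (min(i, j), max(i, j))
        for i, j in predicted
        if abs(i - j) >= min_separation
    }
    if not long_range:
        return 0.0
    native_pairs: Set[Contact] = {(min(i, j), max(i, j)) for i, j in native}
    # Precision = true positives / all long-range predictions.
    return len(long_range & native_pairs) / len(long_range)


# Hypothetical toy data, not CASP results.
predicted = [(1, 30), (5, 40), (10, 12), (2, 60)]
native = [(1, 30), (2, 60), (10, 12), (7, 50)]
precision = long_range_precision(predicted, native)
print(f"long-range precision: {precision:.2f}")
```

In this toy case three predicted pairs qualify as long-range and two of them are native contacts, so the short-range pair (10, 12) does not influence the result.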
Assessment of protein disorder region predictions in CASP10
A systematic analysis of intrinsic disorder in proteins
started at the turn of the century [1–4] and still remains a
hot research topic [5]. This year alone, several papers covering
general aspects of protein disorder have been published [5–9],
and the discussion on the fundamental
principles of disorder continues to unfold [10,11]. A PubMed
search with the keywords “intrinsically disordered protein
2012” and “intrinsically disordered protein 2013”
returned 525 and 305 entries, respectively (as of April
2013). The number of experimentally verified intrinsically
disordered proteins and regions is steadily increasing.
The DisProt database [12] currently contains
annotations for 684 intrinsically disordered proteins and
1513 disordered regions, and describes 38 different biological
functions associated with disordered regions. The
more recently established IDEAL database also has a
number of useful annotations on disordered proteins [13].
Such high interest in this area of research has triggered
rapid development of computational methods for prediction
of the location of disordered regions in proteins. The
recently published reviews and assessment papers [14–18]
altogether provide a comprehensive analysis of more than
fifty disorder prediction methods. An independent assessment
of the protein disorder methods within the scope of CASP started in 2002 and is now already in its sixth
round [18–22]. This study analyzes the results obtained by
the 28 disorder prediction groups participating in CASP10.
Evaluation of model refinement in CASP13.
Performance in the model refinement category of the 13th round of Critical Assessment of Structure Prediction (CASP13) is assessed, showing that some groups consistently improve most starting models whereas the majority of participants continue to degrade the starting model on average. Using the ranking formula developed for CASP12, it is shown that only 7 of 32 groups perform better than a "naïve predictor" who just submits the starting model. Common features in their approaches include a dependence on physics-based force fields to judge alternative conformations and the use of molecular dynamics to relax models to local minima, usually with some restraints to prevent excessively large movements. In addition to the traditional CASP metrics that focus largely on the quality of the overall fold, alternative metrics are evaluated, including comparisons of the main-chain and side-chain torsion angles, and the utility of the models for solving crystal structures by the molecular replacement method. It is proposed that the introduction of these metrics, as well as consideration of the accuracy of coordinate error estimates, would improve the discrimination between good and very good models.
Funding: Wellcome Trust; Marie Skłodowska-Curie grant for EU Horizon 202
Evaluation of template-based models in CASP8 with standard measures
The strategy for evaluating template-based models submitted to CASP has continuously evolved from CASP1 to CASP5, leading to a standard procedure that has been used in all subsequent editions. The established approach includes methods for calculating the quality of each individual model, for assigning scores based on the distribution of the results for each target and for computing the statistical significance of the differences in scores between prediction methods. These data are made available to the assessor of the template-based modeling category, who uses them as a starting point for further evaluations and analyses. This article describes the detailed workflow of the procedure, provides justifications for a number of choices that are customarily made for CASP data evaluation, and reports the results of the analysis of template-based predictions at CASP8
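The step of "assigning scores based on the distribution of the results for each target" can be sketched as a per-target standardisation. This is only an illustration under stated assumptions: the plain z-score, the function name `z_scores`, and the group names and raw scores are hypothetical, and the article itself should be consulted for the actual CASP8 procedure (including any outlier handling).

```python
from statistics import mean, pstdev
from typing import Dict


def z_scores(scores: Dict[str, float]) -> Dict[str, float]:
    """Standardise one target's per-group scores against their distribution."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of this target's scores
    if sigma == 0:
        # All groups scored identically, so no group stands out.
        return {group: 0.0 for group in scores}
    return {group: (s - mu) / sigma for group, s in scores.items()}


# Hypothetical raw quality scores for a single target.
target_scores = {"groupA": 70.0, "groupB": 50.0, "groupC": 30.0}
z = z_scores(target_scores)
for group, value in z.items():
    print(f"{group}: {value:+.2f}")
```

Standardising per target keeps easy targets, where every group scores highly, from dominating a ranking that sums scores over many targets.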
Evaluation of template-based modeling in CASP13.
Performance in the template-based modeling (TBM) category of CASP13 is assessed here, using a variety of metrics. Performance of the predictor groups that participated is ranked using the primary ranking score that was developed by the assessors for CASP12. This reveals that the best results are obtained by groups that include contact predictions or inter-residue distance predictions derived from deep multiple sequence alignments. In cases where there is a good homolog in the wwPDB (TBM-easy category), the best results are obtained by modifying a template. However, for cases with poorer homologs (TBM-hard), very good results can be obtained without using an explicit template, by deep learning algorithms trained on the wwPDB. Alternative metrics are introduced, to allow testing of aspects of structural models that are not addressed by traditional CASP metrics. These include comparisons to the main-chain and side-chain torsion angles of the target, and the utility of models for solving crystal structures by the molecular replacement method. The alternative metrics are poorly correlated with the traditional metrics, and it is proposed that modeling has reached a sufficient level of maturity that the best models should be expected to satisfy this wider range of criteria
A Comprehensive Analysis of the Structure-Function Relationship in Proteins Based on Local Structure Similarity
BACKGROUND: Sequence similarity to characterized proteins provides testable functional hypotheses for less than 50% of the proteins identified by genome sequencing projects. With structural genomics it is believed that structural similarities may give functional hypotheses for many of the remaining proteins. METHODOLOGY/PRINCIPAL FINDINGS: We provide a systematic analysis of the structure-function relationship in proteins using the novel concept of local descriptors of protein structure. A local descriptor is a small substructure of a protein which includes both short- and long-range interactions. We employ a library of commonly reoccurring local descriptors general enough to assemble most existing protein structures. We then model the relationship between these local shapes and Gene Ontology using rule-based learning. Our IF-THEN rule model offers legible, high resolution descriptions that combine local substructures and is able to discriminate functions even for functionally versatile folds such as the frequently occurring TIM barrel and Rossmann fold. By evaluating the predictive performance of the model, we provide a comprehensive quantification of the structure-function relationship based only on local structure similarity. Our findings are, among others, that conserved structure is a stronger prerequisite for enzymatic activity than for binding specificity, and that structure-based predictions complement sequence-based predictions. The model is capable of generating correct hypotheses, as confirmed by a literature study, even when no significant sequence similarity to characterized proteins exists. CONCLUSIONS/SIGNIFICANCE: Our approach offers a new and complete description and quantification of the structure-function relationship in proteins. By demonstrating how our predictions offer higher sensitivity than using global structure, and complement the use of sequence, we show that the presented ideas could advance the development of meta-servers in function prediction.
Assessment of chemical-crosslink-assisted protein structure modeling in CASP13
With the advance of experimental procedures, obtaining chemical crosslinking information is becoming a fast and routine practice. Information on crosslinks can greatly enhance the accuracy of protein structure modeling. Here, we review the current state of the art in modeling protein structures with the assistance of experimentally determined chemical crosslinks within the framework of the 13th meeting of Critical Assessment of Structure Prediction approaches. This largest-to-date blind assessment reveals benefits of using data assistance in difficult-to-model protein structure prediction cases. However, in a broader context, it also suggests that with the unprecedented advance in the accuracy of contact prediction in recent years, experimental crosslinks will be useful only if their specificity and accuracy improve further and they are better integrated into computational workflows.
Target highlights in CASP9: Experimental target structures for the critical assessment of techniques for protein structure prediction
One goal of the CASP community-wide experiment on the critical assessment of techniques for protein structure prediction is to identify the current state of the art in protein structure prediction and modeling. A fundamental principle of CASP is blind prediction on a set of relevant protein targets, that is, the participating computational methods are tested on a common set of experimental target proteins, for which the experimental structures are not known at the time of modeling. Therefore, the CASP experiment would not have been possible without broad support of the experimental protein structural biology community. In this article, several experimental groups discuss the structures of the proteins which they provided as prediction targets for CASP9, highlighting structural and functional peculiarities of these structures: the long tail fiber protein gp37 from bacteriophage T4, the cyclic GMP-dependent protein kinase Iβ dimerization/docking domain, the ectodomain of the JTB (jumping translocation breakpoint) transmembrane receptor, Autotaxin in complex with an inhibitor, the DNA-binding J-binding protein 1 domain essential for biosynthesis and maintenance of DNA base-J (β-D-glucosyl-hydroxymethyluracil) in Trypanosoma and Leishmania, a so far uncharacterized 73 residue domain from Ruminococcus gnavus with a fold typical for PDZ-like domains, a domain from the phycobilisome core-membrane linker phycobiliprotein ApcE from Synechocystis, the heat shock protein 90 activators PFC0360w and PFC0270w from Plasmodium falciparum, and 2-oxo-3-deoxygalactonate kinase from Klebsiella pneumoniae.
© 2011 Wiley-Liss, Inc.
Grant sponsor: Spanish Ministry of Education and Science; Grant number: BFU2008-01588; Grant sponsor: European Commission; Grant number: NMP4-CT-2006-033256; Grant sponsor: Spanish Ministry of Education and Science (José Castillejo fellowship); Grant sponsor: Xunta de Galicia (Angeles Alvariño fellowship); Grant sponsor: National Institutes of Health; Grant numbers: K22-CA124517 (D.E.C.); R01-GM090161 (C.K.) GM074942; GM094585; Grant sponsor: U.S. Department of Energy, Office of Biological and Environmental Research; Grant number: DE-AC02-06CH11357 (to A.J.); Grant sponsor: Foundation for Polish Science (to K.M.); Grant sponsor: NSF; Grant number: DBI 0829586